
    SentiCap: Generating Image Descriptions with Sentiments

    The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.

    Automatic Image Captioning with Style

    This thesis connects two core topics in machine learning, vision and language. The problem of choice is image caption generation: automatically constructing natural language descriptions of image content. Previous research into image caption generation has focused on generating purely descriptive captions; I focus on generating visually relevant captions with a distinct linguistic style. Captions with style have the potential to ease communication and add a new layer of personalisation. First, I consider naming variations in image captions, and propose a method for predicting context-dependent names that takes into account visual and linguistic information. This method makes use of a large-scale image caption dataset, which I also use to explore naming conventions and report naming conventions for hundreds of animal classes. Next I propose the SentiCap model, which relies on recent advances in artificial neural networks to generate visually relevant image captions with positive or negative sentiment. To balance descriptiveness and sentiment, the SentiCap model dynamically switches between two recurrent neural networks, one tuned for descriptive words and one for sentiment words. As the first published model for generating captions with sentiment, SentiCap has influenced a number of subsequent works. I then investigate the sub-task of modelling styled sentences without images. The specific task chosen is sentence simplification: rewriting news article sentences to make them easier to understand. For this task I design a neural sequence-to-sequence model that can work with limited training data, using novel adaptations for word copying and sharing word embeddings. Finally, I present SemStyle, a system for generating visually relevant image captions in the style of an arbitrary text corpus. A shared term space allows a neural network for vision and content planning to communicate with a network for styled language generation. 
    SemStyle achieves competitive results in human and automatic evaluations of descriptiveness and style. As a whole, this thesis presents two complete systems for styled caption generation that are first of their kind and demonstrate, for the first time, that automatic style transfer for image captions is achievable. Contributions also include novel ideas for object naming and sentence simplification. This thesis opens up inquiries into highly personalised image captions; large scale visually grounded concept naming; and more generally, styled text generation with content control.

    UNIPoint: Universally Approximating Point Processes Intensities

    Point processes are a useful mathematical tool for describing events over time, and so there are many recent approaches for representing and learning them. One notable open question is how to precisely describe the flexibility of point process models and whether there exists a general model that can represent all point processes. Our work bridges this gap. Focusing on the widely used event intensity function representation of point processes, we provide a proof that a class of learnable functions can universally approximate any valid intensity function. The proof connects the well-known Stone-Weierstrass Theorem for function approximation, the uniform density of non-negative continuous functions using a transfer function, the formulation of the parameters of piece-wise continuous functions as a dynamic system, and a recurrent neural network implementation for capturing the dynamics. Using these insights, we design and implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event. Evaluations on synthetic and real-world datasets show that this simpler representation performs better than Hawkes process variants and more complex neural network-based approaches. We expect this result will provide a practical basis for selecting and tuning models, as well as furthering theoretical work on representational complexity and learnability.
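The idea of an intensity built from a sum of basis functions, passed through a non-negative transfer function, with parameters driven by a recurrent state, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the exponential basis, the softplus transfer function, the Elman-style update, and the names `rnn_step`, `intensity`, `W`, `U`, `b`, `A` and `B` are all hypothetical.

```python
import math

def softplus(x):
    # Numerically stable softplus; used here as the (assumed) transfer
    # function that maps any real number to a non-negative value.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rnn_step(h, dt, W, U, b):
    # Assumed Elman-style update, applied once per event:
    # h' = tanh(W h + U * dt + b), where dt is the time since the previous event.
    return [math.tanh(dot(W[i], h) + U[i] * dt + b[i]) for i in range(len(h))]

def intensity(tau, h, A, B):
    # Intensity at time tau after the latest event, as a sum of exponential
    # basis functions whose parameters are read off the hidden state h:
    #   alpha_j = (A h)_j              (weight, any sign)
    #   beta_j  = softplus((B h)_j)    (decay rate, kept positive)
    total = 0.0
    for a_row, b_row in zip(A, B):
        alpha = dot(a_row, h)
        beta = softplus(dot(b_row, h))
        total += alpha * math.exp(-beta * tau)
    return softplus(total)

# Toy, hand-picked parameters (hypothetical; a real model would learn them).
W = [[0.5, -0.2], [0.1, 0.3]]
U = [0.4, -0.6]
b = [0.0, 0.1]
A = [[1.0, -0.5], [-0.8, 0.6]]
B = [[0.3, 0.2], [-0.4, 0.9]]

h = [0.0, 0.0]
for dt in [0.5, 1.2, 0.3]:        # inter-event times of a toy history
    h = rnn_step(h, dt, W, U, b)  # parameters change only upon each event

values = [intensity(tau, h, A, B) for tau in (0.0, 0.5, 1.0, 2.0)]
```

Between events the hidden state is held fixed, so the intensity is a closed-form function of the elapsed time `tau`; only the arrival of a new event triggers a recurrent update.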

    Dynamics Inside the Radio and X-ray Cluster Cavities of Cygnus A and Similar FRII Sources

    We describe approximate axisymmetric computations of the dynamical evolution of material inside radio lobes and X-ray cluster gas cavities in Fanaroff-Riley II sources such as Cygnus A. All energy is delivered by a jet to the lobe/cavity via a moving hotspot where jet energy dissipates in a reverse shock. Our calculations describe the evolution of hot plasma, cosmic rays (CRs) and toroidal magnetic fields flowing from the hotspot into the cavity. Many observed features are explained. Gas, CRs and field flow back along the cavity surface in a "boundary backflow" consistent with detailed FRII observations. Computed ages of backflowing CRs are consistent with observed radio-synchrotron age variations only if shear instabilities in the boundary backflow are damped and we assume this is done with viscosity of unknown origin. Magnetic fields estimated from synchrotron self-Compton (SSC) X-radiation observed near the hotspot evolve into radio lobe fields. Computed profiles of radio synchrotron lobe emission perpendicular to the jet are dramatically limb-brightened in excellent agreement with FRII observations although computed lobe fields exceed those observed. Strong winds flowing from hotspots naturally create kpc-sized spatial offsets between hotspot inverse Compton (IC-CMB) X-ray emission and radio synchrotron emission that peaks 1-2 kpc ahead where the field increases due to wind compression. In our computed version of Cygnus A, nonthermal X-ray emission increases from the hotspot (some IC-CMB, mostly SSC) toward the offset radio synchrotron peak (mostly SSC). A faint thermal jet along the symmetry axis may be responsible for redirecting the Cygnus A non-thermal jet. Comment: 24 pages, 10 figures, accepted by Ap

    Packing 3-vertex paths in claw-free graphs and related topics

    An L-factor of a graph G is a spanning subgraph of G whose every component is a 3-vertex path. Let v(G) be the number of vertices of G and d(G) the domination number of G. A claw is a graph with four vertices and three edges incident to the same vertex. A graph is claw-free if it has no induced subgraph isomorphic to a claw. Our results include the following. Let G be a 3-connected claw-free graph, x a vertex in G, e = xy an edge in G, and P a 3-vertex path in G. Then (a1) if v(G) = 0 mod 3, then G has an L-factor containing (avoiding) e, (a2) if v(G) = 1 mod 3, then G - x has an L-factor, (a3) if v(G) = 2 mod 3, then G - {x,y} has an L-factor, (a4) if v(G) = 0 mod 3 and G is either cubic or 4-connected, then G - P has an L-factor, (a5) if G is cubic with v(G) > 5 and E is a set of three edges in G, then G - E has an L-factor if and only if the subgraph induced by E in G is not a claw and not a triangle, (a6) if v(G) = 1 mod 3, then G - {v,e} has an L-factor for every vertex v and every edge e in G, (a7) if v(G) = 1 mod 3, then there exist a 4-vertex path N and a claw Y in G such that G - N and G - Y have L-factors, and (a8) d(G) < v(G)/3 +1 and if in addition G is not a cycle and v(G) = 1 mod 3, then d(G) < v(G)/3. We explore the relations between packing problems of a graph and its line graph to obtain some results on different types of packings. We also discuss relations between L-packing and domination problems as well as between induced L-packings and the Hadwiger conjecture. Keywords: claw-free graph, cubic graph, vertex disjoint packing, edge disjoint packing, 3-vertex factor, 3-vertex packing, path-factor, induced packing, graph domination, graph minor, the Hadwiger conjecture. Comment: 29 page
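The two definitions above (claw-free graph, L-factor) can be checked by brute force on small graphs. The sketch below is only an illustration of the definitions, not the paper's method: the function names `is_claw_free` and `find_l_factor`, the adjacency-dictionary representation, and the example graphs are assumptions chosen for this example, and the exhaustive search is exponential in the worst case.

```python
from itertools import combinations

def is_claw_free(adj):
    # A claw is induced when some vertex has three pairwise
    # non-adjacent neighbours; report False if one is found.
    for nbrs in adj.values():
        for a, b, c in combinations(sorted(nbrs), 3):
            if b not in adj[a] and c not in adj[a] and c not in adj[b]:
                return False
    return True

def find_l_factor(adj):
    # Return a list of 3-vertex paths (end, middle, end) that together
    # cover every vertex exactly once, or None if no L-factor exists.
    if len(adj) % 3 != 0:
        return None

    def extend(remaining):
        if not remaining:
            return []
        v = min(remaining)  # the smallest uncovered vertex must lie on some path
        candidates = set()
        for a in adj[v] & remaining:
            # v is an end of the path and a is its middle vertex
            for b in (adj[a] & remaining) - {v}:
                candidates.add((v, a, b))
            # v is the middle vertex, with ends a and b
            for b in (adj[v] & remaining) - {a}:
                candidates.add((min(a, b), v, max(a, b)))
        for path in sorted(candidates):
            rest = extend(remaining - set(path))
            if rest is not None:
                return [path] + rest
        return None

    return extend(frozenset(adj))

# Triangular prism: cubic, 3-connected, claw-free, six vertices.
prism = {0: {1, 2, 3}, 1: {0, 2, 4}, 2: {0, 1, 5},
         3: {0, 4, 5}, 4: {1, 3, 5}, 5: {2, 3, 4}}
# The claw itself (a star with three leaves).
claw = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# A star with five leaves: six vertices, but no L-factor.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}

prism_factor = find_l_factor(prism)
```

The prism satisfies the hypotheses of (a1), so an L-factor is expected for it; the star shows that divisibility of v(G) by 3 alone is not sufficient.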

    AVS Corner, May 2018
